Finite genome size can halt Muller's ratchet

Abstract: We study the accumulation of deleterious
mutations in a haploid, asexually reproducing
population, using analytical models and computer
simulations. We find that Muller's ratchet can come
to a halt in small populations as a consequence of
a finite genome size only, in the complete absence
of backward or compensatory mutations, epistasis,
or recombination. The origin of this effect lies in
the fact that the number of loci at which mutations
can create considerable damage decreases with every
turn of the ratchet, while the total number of mutations
per genome and generation remains constant.
Whether the ratchet will come to a halt eventually
depends on the ratio of the per-locus deleterious
mutation rate u and the selection strength s. For
sufficiently small u/s, the ratchet halts after only a
few clicks. We discuss the implications of our results
for bacterial and virus evolution.

Keywords: Muller's ratchet, mutation accumulation,
population bottlenecks, RNA viruses, compensatory mutations

[...] Muller's ratchet, i.e., the continual loss of those
[...] load of mutations (Muller 1964; Felsenstein 1974).
Sexual populations can regenerate individuals with
reduced number of mutations through recombination,
while asexual populations cannot. This may
be one of the main advantages of sexual recombination
(Felsenstein 1974; Maynard Smith 1978; Takahata
1982; Pamilo et al. 1987; Antezana and Hudson
1997). Besides its importance for theories of the
evolution and maintenance of sex, Muller's ratchet
may also play a significant role in bacterial or viral
dynamics, where host to host transmission often
creates severe bottlenecks (Chao 1990; Andersson
and Hughes 1996). Conventional wisdom says that
Muller's ratchet proceeds at a constant rate, as a result
of which an asexual population may eventually
go extinct (Lynch and Gabriel 1990; Gabriel et al.
1993). The main assumptions that enter the classic
ratchet model are a vanishing probability of back mutations,
a genome with an infinite number of loci, and
mutations that do not interact. When epistatic interactions
are taken into account, Muller's ratchet will
slow down over time (Charlesworth et al. 1993; Kondrashov
1994), though it will not necessarily come to
a halt (Butcher 1995). On the other hand, if back mutations
are assumed to be frequent and the genome
is finite, then the ratcheting process will always stop
eventually, even in the absence of epistatic interactions
(Woodcock and Higgs 1996; Prügel-Bennett
1997). A similar result can be found if there is
sufficient supply of compensatory mutations (Wagner
and Gabriel 1990). However, since the majority
of studies focuses on the classic model with infinite
genome (Haigh 1978; Pamilo et al. 1987; Stephan
et al. 1993; Higgs and Woodcock 1995; Gessler
1995; Charlesworth and Charlesworth 1997; Gordo
and Charlesworth 2000a; Gordo and Charlesworth
2000b), it is not entirely clear how Muller's ratchet
affects bacteria or viruses, which contain at most a
couple of hundred genes.

Here, we investigate the selection-mutation balance 
for a finite number of loci, and we also consider ar- 
bitrary forward and back mutation rates. Perhaps 
not so surprisingly, we find that if the back mutation 
rate is small but non-zero, selection-mutation balance 
stabilizes the population before the genome is completely
deteriorated. More importantly, however, the
assumption of a finite number of loci alone guarantees 
that even in the complete absence of back mutations, 
Muller's ratchet can come to a complete halt within 
a biologically relevant time frame if selection is suffi- 
ciently strong. 

Infinite population size 

As in the case of an infinitely long genome, the de- 
terministic mutation-selection balance of an infinite 
population is a useful basis for the subsequent analysis
of finite population effects (Haigh 1978). We
therefore give a brief treatment of the infinite popu- 
lation case, and discuss the influence of various rates 
of back mutations on the equilibrium distribution of 
deleterious mutations. 

The deterministic mutation-selection equilibrium 
can be calculated straightforwardly from the eigen- 
vector corresponding to the largest eigenvalue of the 
transition matrix W (Eigen et al. 1988). The transition
matrix contains the rates at which genotypes
are produced as offspring of other genotypes, i.e., an 
entry Wij in the transition matrix is given by the fit- 
ness of genotype j times the probability that geno- 
type j mutates into genotype i. In the present case, 
the analysis can be simplified because we disregard 
epistatic interactions, in which case it is sufficient to 
calculate the equilibrium point for a single locus. The 
generalization to a finite number of loci L is trivial
(Rumschitzki 1987).

We assume a single locus with two alleles, '+' and
'-'. The '+' allele carries a fitness of 1, and the '-'
allele carries a fitness of 1 - s. Mutations from '+'
to '-' occur at a rate u per locus. We call
these mutations forward (or deleterious) mutations.
Conversely, mutations from '-' to '+' occur at a
rate v, and we call these mutations back mutations.
The transition matrix for this single-locus, two-allele
model is given by

$$
W = \begin{pmatrix} 1-u & (1-s)v \\ u & (1-s)(1-v) \end{pmatrix}. \tag{1}
$$



From diagonalization, we find that the '-' allele is
present in the population at a concentration of

$$
a = \frac{1}{2}\left[\, 1 - v + \frac{u+v}{s}
    - \sqrt{\left(1 - v + \frac{u+v}{s}\right)^{2} - \frac{4u}{s}} \,\right]. \tag{2}
$$

The concentration of the '+' allele is correspondingly
1 - a. In the case of L loci, this result generalizes to

$$
x_k = \binom{L}{k}\, a^{k} (1-a)^{L-k}, \tag{3}
$$

where x_k is the concentration of genomes that carry
k alleles of the '-' type (we will in the following simply
say that these genomes carry k mutations). From
(3), the average number of mutations ⟨k⟩ in the population
follows as ⟨k⟩ = aL.
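The single-locus equilibrium can be cross-checked numerically. The following is a minimal sketch, not the authors' code: it evaluates the closed-form concentration of the '-' allele and compares it with a direct iteration of the transition matrix W on the two allele frequencies. The function names and the number of iterations are illustrative assumptions.

```python
import math


def equilibrium_minus_fraction(u, v, s):
    """Closed-form equilibrium concentration a of the '-' allele (Eq. 2)."""
    b = 1.0 - v + (u + v) / s
    return 0.5 * (b - math.sqrt(b * b - 4.0 * u / s))


def iterate_single_locus(u, v, s, generations=5000):
    """Apply W repeatedly to (x_plus, x_minus), renormalizing each time.

    Assumes a starting point with both alleles present, so the iteration
    converges to the eigenvector of the largest eigenvalue.
    """
    x_plus, x_minus = 0.5, 0.5
    for _ in range(generations):
        new_plus = (1.0 - u) * x_plus + (1.0 - s) * v * x_minus
        new_minus = u * x_plus + (1.0 - s) * (1.0 - v) * x_minus
        total = new_plus + new_minus
        x_plus, x_minus = new_plus / total, new_minus / total
    return x_minus


def mean_mutations(u, v, s, L):
    """Average number of mutations <k> = a * L for L independent loci."""
    return equilibrium_minus_fraction(u, v, s) * L
```

For example, `mean_mutations(0.01, 0.0, 0.1, 100)` evaluates a = u/s = 0.1 for 100 loci, which corresponds to the no-back-mutation case discussed next.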

If we set the back mutation rate to v = 0, we find
the simple expression

$$
\frac{\langle k \rangle}{L} =
\begin{cases}
1 & \text{for } u > s, \\
u/s & \text{for } u < s.
\end{cases} \tag{4}
$$



We learn that the fate of an infinite population with- 
out back mutations is determined by the ratio be- 
tween the rate of deleterious mutations per locus u 
and the selection strength s. When u is of the order 
of s or larger, then (k) = L, which means that the 
population will eventually consist exclusively of indi- 
viduals that have been hit by mutations at all loci. 
For u smaller than s, on the other hand, the majority 
of individuals in the population will accumulate only 
a small number of mutations. 

In Fig. 1, we show the average fraction of mutated
loci in the population, ⟨k⟩/L, as a function of the
forward mutation rate u and for various back muta- 
tion rates v. Note the sharpness of the transition at 
u/s = 1. If u is only a factor of 10 smaller than s, 
then on average each individual carries mutations in 
only 10% of the loci. We observe further that the 
ratio of u/s plays a more important role in determin- 
ing the equilibrium point of the population than the 
back mutation rate v. Unless v is of the order of u, 
the deviations from the case of v = 0 are small.

It is worth mentioning that if we substitute a = u/s
into (3) and take the limit L → ∞ while keeping the
genomic mutation rate U = uL constant, we recover
the Poisson distribution of mutants that is normally
assumed in models of Muller's ratchet with infinite
genome size (Haigh 1978).
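The limit can be written out in a few lines. The sketch below uses θ = U/s as shorthand for the ratio of the genomic mutation rate to the selection strength, so that a = u/s = θ/L; the intermediate factorization is a standard step added here for illustration.

```latex
x_k = \binom{L}{k}\left(\frac{\theta}{L}\right)^{k}\left(1-\frac{\theta}{L}\right)^{L-k}
    = \frac{\theta^{k}}{k!}\;
      \frac{L(L-1)\cdots(L-k+1)}{L^{k}}\;
      \left(1-\frac{\theta}{L}\right)^{L}
      \left(1-\frac{\theta}{L}\right)^{-k}
    \;\xrightarrow{\;L\to\infty\;}\;
      \frac{\theta^{k}}{k!}\,e^{-\theta},
\qquad \theta = \frac{U}{s}.
```

For fixed k, the second and fourth factors tend to 1 and the third tends to e^{-θ}, which leaves the Poisson distribution with mean θ.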






Finite population size 
Simulation methods 

All simulation results reported in this work were obtained
in the following manner. We keep track of
L + 1 variables n_i, each of which counts the number
of individuals with i mutations currently present in
the population. In order to propagate the population
through one round of selection and mutation, we create
a new set of variables n_i(t+1) from the current set
n_i(t). Initially, all n_i(t+1) are set to zero. We then
repeat the following steps N times: We choose a mutation
class i at random for replication, according to
probabilities p_i that are proportional to (1 - s)^i n_i(t).
We then test i times for backward mutations, with
probability v, and L - i times for forward mutations,
with probability u. The mutation class j of the new
individual is then i minus the successful back mutations
plus the new forward mutations. The variable
n_j(t+1) is correspondingly incremented by one. After
N repetitions of these steps, we have fully assembled
the population of time t + 1, and we begin again
with a new set of variables n_i(t+2), and so on. The
distribution at t = 0 is always n_0(0) = N, n_i(0) = 0
for all i > 0.
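The procedure above can be sketched in a few lines of Python. This is an illustrative reimplementation under the stated rules, not the original simulation code; the function names, the seed handling, and the simple Bernoulli-sum binomial sampler are assumptions made to keep the example dependency-free.

```python
import random


def binomial_draw(trials, p, rng):
    """Count successes in `trials` independent tests, each with probability p."""
    return sum(1 for _ in range(trials) if rng.random() < p)


def next_generation(n, s, u, v, rng):
    """Propagate the class counts n[0..L] through one round of selection and mutation."""
    L = len(n) - 1
    N = sum(n)
    # Replication probabilities p_i are proportional to (1 - s)^i * n_i(t).
    weights = [(1.0 - s) ** i * n[i] for i in range(L + 1)]
    new_n = [0] * (L + 1)
    for i in rng.choices(range(L + 1), weights=weights, k=N):
        back = binomial_draw(i, v, rng)         # i tests for back mutations
        forward = binomial_draw(L - i, u, rng)  # L - i tests for forward mutations
        new_n[i - back + forward] += 1
    return new_n


def simulate(N, L, s, u, v, generations, seed=0):
    """Run the model from n_0(0) = N, n_i(0) = 0 for i > 0."""
    rng = random.Random(seed)
    n = [N] + [0] * L
    for _ in range(generations):
        n = next_generation(n, s, u, v, rng)
    return n


def least_loaded_class(n):
    """Smallest number of mutations carried by any individual."""
    return next(i for i, count in enumerate(n) if count > 0)


def mean_mutations(n):
    """Population average of the number of mutations."""
    return sum(i * count for i, count in enumerate(n)) / sum(n)
```

A call such as `simulate(N=50, L=100, s=0.1, u=0.01, v=0.0, generations=1000)` returns the class counts after 1000 generations, from which `least_loaded_class` and `mean_mutations` can be read off. Sampling parents with replacement keeps the population size fixed at N, as in the description above.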

Moment expansion 

For the case of an infinite population, we have seen 
that no significant accumulation of mutations occurs 
if the selective strength is larger than the per-locus 
deleterious mutation rate. No back mutations are 
necessary to maintain this balance between selection 
and mutation in an infinite population, because the 
wild type (with all loci carrying the '+' allele) is 
never lost. The equilibrium point is then simply a 
function of the rate at which mutants are produced 
from the wild type, and the rate at which the wild 
type replicates in comparison to the replication rate 
of the mutants. The case of a finite population is fun- 
damentally different, because there the least loaded 
class can be lost. 

We have calculated the average number of mutations
in a population using a moment expansion
technique previously used by Woodcock and Higgs
(1996). In essence, we have repeated their calculations,
but have kept the forward and the backward
mutation rates independent, instead of assuming
u = v. The details of these calculations are given
in the Appendix.

We compared our analytical result with numeri- 
cal simulations. We found that our approximations 
worked well for selective values s of the order of u or 
smaller (Fig. 2), but did not give a reasonable estimate
when s > u. This is not too surprising, given
that the calculations are done under the assumption 
that s is small. Unfortunately, however, the interest- 
ing parameter region is s > u, for which the infinite 
population model allows for a non-trivial equilibrium 
even in the absence of back mutations. We therefore 
had to develop an alternative analytical description 
that would be more suitable for s > u. 

Effective genome size model 

In this section, we focus on the case of no back mu- 
tations, v = 0. From a mathematical point of view, 
a finite population without back mutations will always
hit the absorbing boundary at ⟨k⟩ = L eventually,
i.e., it will always lose all '+' alleles as time
t → ∞. Therefore, we know that any set of equations
aimed at finding the true equilibrium point of such 
a system is bound to yield (k) = L. However, this 
solution has no biological significance if the waiting 
time to arrive at (k) = L is extremely large, e.g., of 
the order of millions of generations or larger. Con- 
sequently, we must try to find a description for the 
increase in waiting times between successive clicks, 
and may assume that the ratchet has stopped once 
the waiting times exceed a biologically relevant time 
scale. Below follows a mathematical description for 
the waiting times that, while being too simplistic to 
yield accurate quantitative results for very small N, 
describes well the qualitative behavior that we see in 
simulations, and gives valuable insight into the na- 
ture of Muller's ratchet in finite genomes. 

We assume that the number of sequences in the
least loaded class, n_0, is given by the infinite-population
concentration times the population size N, and
that loss of the least loaded class occurs if all individuals
in that class either fail to reproduce or
give birth to mutated offspring. The situation of a
population with a least loaded class having k_min mutations
is equivalent to one in which the least loaded
class has zero mutations, but the genome is of length
L - k_min. In the following, we will therefore refer
to L - k_min as the effective genome size, and to the
model that we describe now as the "effective genome
size model".

From (3), we find

$$
n_0 = N(1-a)^{L-k_{\min}} \tag{5a}
$$
$$
\phantom{n_0} \approx N e^{a(k_{\min}-L)}. \tag{5b}
$$

The probability ξ that an organism in the next generation
is an offspring of one of the n_0 organisms in
the least loaded class is given by [with x_i as defined
in (3)]

$$
\xi = \frac{(n_0/N)\,(1-s)^{k_{\min}}}{\sum_i x_i (1-s)^{i}}. \tag{6}
$$

The probability that there will be j descendants of
the least loaded class in the next generation can be
expressed in terms of ξ as

$$
P_{\rm birth}(j) = \binom{N}{j}\, \xi^{j} (1-\xi)^{N-j}. \tag{7}
$$

The probability that all these j organisms carry at
least one additional mutation is given by

$$
P_{\rm mut}(j) = \left[1-(1-u)^{L-k_{\min}}\right]^{j}. \tag{8}
$$

The total probability of a ratchet event then follows
as

$$
P_{\rm ratchet} = \sum_{j=0}^{N} P_{\rm mut}(j)\, P_{\rm birth}(j)
 = \left[1 - \xi (1-u)^{L-k_{\min}}\right]^{N} \tag{9a}
$$
$$
\phantom{P_{\rm ratchet}} \approx \exp\!\left[-N\xi (1-u)^{L-k_{\min}}\right]. \tag{9b}
$$

We find that P_ratchet decays exponentially with the
two quantities Nξ and (1 - u)^(L - k_min). The latter
quantity grows exponentially with every turn of
the ratchet, i.e., (1 - u)^(L - k_min) increases exponentially
as the effective genome length L - k_min decreases.
Moreover, Nξ is proportional to n_0, which
again increases exponentially with decreasing effective
genome length. As a result, P_ratchet decays
superexponentially as the ratchet turns. This implies
that the ratchet will turn at a relatively high rate
initially, but can suddenly reach a point where the
waiting time exceeds any biologically relevant time
scale. This is dramatically illustrated in Fig. 3. Note
the logarithmic time scale. Within the first 1000 generations,
70% of the genes were lost, while during the
next 10^5 generations, only an additional 10% of the
genes were lost. Moreover, in the final 50000 generations,
we observed hardly any clicks of the ratchet at
all.

Figure 3 also gives a comparison between simulation
results and our theoretical model. The theoretical
prediction was derived from (9) by assuming
that the average waiting time till the next ratcheting
event is given by the inverse of P_ratchet, so that the
time until a certain k_min is reached is simply given
by the sum of the inverse ratchet probabilities from
the first ratchet event until the (k_min - 1)th ratchet
event. When comparing the results thus obtained
to the simulation results, we find that although the
model predicts a too early slowdown of the ratchet,
the qualitative form of the predicted curve agrees
well with the one from the simulations. This demonstrates
that our main reasoning about the increase
in waiting times between successive ratchet events
is correct, even though the microscopic details are
somewhat incorrect. The origin of the deviations between
model and simulations is clear. At the beginning
of the simulation, we initialize all sequences to
the wild type with zero mutations, while the model
always assumes a fully developed mutation-selection
equilibrium. The population in the simulation thus
retains at early times a larger fraction of individuals
in the least loaded classes than what the model
assumes, and the model overestimates the transition
probabilities. For later times, the model underestimates
the transition probabilities, because it disregards
cases in which the loss of the least loaded class
happens in more than one step, for example, cases in
which the number of individuals in the least loaded
class first fluctuates to an unusually low value before
the currently least loaded class disappears completely.
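The waiting-time argument can be illustrated with a short calculation. The sketch below is one possible reading of the effective genome size model for v = 0, not the authors' implementation: it assumes that the mutation classes above the least loaded class follow the binomial equilibrium on the effective genome of length L - k_min, so that the common fitness factor (1 - s)^k_min cancels in ξ. The function names are hypothetical.

```python
import math


def p_ratchet(k_min, N, L, s, u):
    """Per-generation click probability, Eq. (9a), for v = 0.

    Assumption: class concentrations are binomial on the effective genome
    of length L - k_min, with equilibrium fraction a = min(1, u / s).
    """
    L_eff = L - k_min
    a = min(1.0, u / s)
    # x[j]: concentration of individuals with k_min + j mutations.
    x = [math.comb(L_eff, j) * a ** j * (1.0 - a) ** (L_eff - j)
         for j in range(L_eff + 1)]
    # Mean fitness relative to the least loaded class.
    mean_w = sum(x[j] * (1.0 - s) ** j for j in range(L_eff + 1))
    xi = x[0] / mean_w
    return (1.0 - xi * (1.0 - u) ** L_eff) ** N


def time_to_reach(k_target, N, L, s, u):
    """Sum of inverse click probabilities for k_min = 0 .. k_target - 1."""
    total = 0.0
    for k in range(k_target):
        p = p_ratchet(k, N, L, s, u)
        if p == 0.0:
            return math.inf  # click probability underflows: ratchet has halted
        total += 1.0 / p
    return total
```

With the parameters of Fig. 3 (N = 1000, L = 100, s = 0.1, u = 0.01), comparing `time_to_reach(40, ...)` with `time_to_reach(60, ...)` shows how sharply the expected waiting time grows once the effective genome has shrunk. Under the stated assumption the click probability reduces to [1 - (1 - u/s)^(L - k_min)]^N, which makes the superexponential decay explicit.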

In Fig. 4, we show another comparison between our
model and the simulation results. There, we consider
the average number of mutations in the population,
⟨k⟩, instead of the number of mutations in the least
loaded class, k_min, so that we can compare our results
to the moment expansion as well. In the effective
genome size model, we calculated ⟨k⟩ from k_min
with the formula ⟨k⟩ = k_min + (L - k_min)a, i.e., the
k_min mutations carried by every individual plus the
equilibrium load of the remaining L - k_min loci. Both
the effective genome size model and the moment expansion
agree on the mutation rate at which Muller's ratchet
starts to slow down. However, the form of the transition
to a mutation free population is significantly
misrepresented by the moment expansion, while the
effective genome size model predicts the qualitative
form of the transition well. In addition, the moment
expansion does not yield any insights into the temporal
change of the ratchet rate. The effective genome
size model is therefore the more useful one, despite
its mathematical simplicity.

Simulation results 

For population sizes below about one hundred, the
effective genome size model consistently predicts too
low ratchet rates. This is not surprising, as our model
suffers from the same limitations that infinite genome
models do when n_0 < 1 (Gessler 1995), namely insufficient
equilibration of the least loaded class. We
therefore have to resort to simulation results for very
small population sizes. Fig. 5 shows the simulation
data from Fig. 4 for N = 500, and in addition results
for N = 50 and N = 10. All three cases show
a very similar behavior, but shifted to smaller and
smaller mutation rates. For sufficiently small mutation
rates, Muller's ratchet stops after the first few
clicks. In a transition region that covers roughly one
order of magnitude in the mutation rate, Muller's
ratchet stops at intermediate points, with a good
fraction of genes lost, but another large fraction of
genes retained. When the mutation rate is too high,
all genes are lost within a biologically relevant time
scale.

In our analytical model presented above, we have
assumed that in order to predict the probability of the
next click of the ratchet, it is sufficient to know the
effective length L - k_min, while the absolute values
of L and k_min do not enter the calculation. If this
assumption is correct, then for any given mutation
rate u, the value of L - k_min at which the ratchet
stops should be independent of L. In other words,
the per-locus mutation rate u determines how many
loci can at most be retained unmutated, and the genomic
mutation rate uL is largely irrelevant. Fig. 6
demonstrates that this reasoning is consistent with
our simulation results. When we plot the difference
between the length L and the average number of mutations
⟨k⟩ as a function of u, the results for widely
differing lengths lie right on top of each other in the
regime in which substantial mutation accumulation
occurs.

We have also performed a number of simulations
with a non-zero rate of back mutations v (data not
shown). The main result is similar to the case of an
infinite population, as depicted in Fig. 1. Unless the
back mutation rate is of the same order of magnitude
as the forward mutation rate, back mutations do not
have a large effect on the rate at which mutations
accumulate. The ratio u/s and the population size
N are the main determinants of whether genomes accumulate
a fair amount of mutations, or stay largely
mutation free.

Discussion 

Two main differences between Muller's ratchet 
in infinite and in finite genomes emerge from our 
study. First, the ratchet rate is constant for infinite 
genomes, whereas the rate decays with every turn of 
the ratchet in a finite genome. The slowdown of the 
ratchet rate is so dramatic that the ratchet can stop 
completely, even in the absence of epistasis or com- 
pensatory mutations. Second, in an infinite genome, 
the main parameter that governs the ratchet rate, 
besides the population size, is the ratio θ between
the genomic deleterious mutation rate and the selec- 
tion strength (Haigh 1978). In a finite genome, on 
the other hand, it is not the genomic but the per 
locus deleterious mutation rate that we have to com- 
pare to the selection strength. As a consequence, 
Muller's ratchet is not as important a limiting factor 
in the evolution of large genomes as was previously 
thought (Maynard Smith 1978). An additional consequence
is that we can expect organisms with longer
genomes and more sophisticated error correction to 
be less prone to mutation accumulation than organ- 
isms with shorter genomes, even if the mutation rates 
per genome are comparable. 

Backwards or compensatory mutations result in an 
additional slowdown of the ratchet. However, un- 
less they occur at a rate comparable to the forward 
rate, they do not play a significant role in determin- 
ing whether the ratchet stops early, or continues until 
a large proportion of the genome is deteriorated. It is 
therefore adequate to neglect them for order of mag- 
nitude estimates of the ratchet rate. Nevertheless, a 
complete theoretical description of the ratchet rate, 
including a finite genome size and variable forward 
and back mutation rates, is certainly desirable but 
completely lacking at this point. 

The range of parameters that cause Muller's ratchet
to halt early is certainly biologically plausible. For
example, Andersson and Hughes (1996) estimate the
mutation rate in Salmonella typhimurium at 0.0014 to
0.0072 mutations per genome per generation, in a
genome of about 200 genes (Riley 1993). Andersson
and Hughes do not estimate the selection strength s.
However, selection is certainly not weak, considering
that out of the five lineages in which fitness loss was 
observed, two experienced a doubling of their gener- 
ation time, while the other three had an increase in 
generation time of about 10-15%. Given the low mu- 
tation rate and the number of bottlenecks of only 60, 
we can estimate that this loss in fitness was probably 
due to only a small number of mutations, maybe be- 
tween one and three. Of course, a more accurate es- 
timate of the selection strength in this system would 
be desirable. In any case, the parameters of the sim- 
ulations in Fig. [5] (s = 0.1 and L = 100) are prob- 
ably of the right order of magnitude, and hence we 
find that while bottlenecks of size one will lead to 
Muller's ratchet, bottleneck sizes above 10 or 50 will 
not lead to persistent genome deterioration. There- 
fore, if the bottlenecks encountered during transmis- 
sion from one host to another are typically between 
10 to 100 individuals, bacterial populations may not 
suffer significantly from mutation accumulation over 
time. 
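As a rough worked example of the comparison that matters here (per-locus mutation rate against selection strength), the genomic rates quoted above can be converted as follows. The sketch assumes that the genomic rate is spread evenly over the roughly 200 genes, and it borrows s = 0.1 from the simulation parameters purely for illustration; Andersson and Hughes report no value of s.

```latex
u = \frac{U}{L}: \qquad
\frac{0.0014}{200} = 7.0\times10^{-6}
\quad\text{to}\quad
\frac{0.0072}{200} = 3.6\times10^{-5},
\qquad
\frac{u}{s}\bigg|_{s=0.1} \approx 7\times10^{-5}
\quad\text{to}\quad
3.6\times10^{-4}.
```

Under these assumptions u/s lies several orders of magnitude below 1, i.e., deep in the regime where the infinite-population equilibrium carries only a small mutation load.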

In the case of RNA viruses, where mutation rates
are much higher, it seems that the risk of mutation
accumulation should be higher as well. Since sufficiently
large virus populations can exist without loss
in fitness, however, we can assume that u/s ≪ 1,
so that Muller's ratchet becomes important only for
very small populations. Most experimental work focuses
on bottleneck sizes of one (Chao 1990; Duarte
et al. 1992; Elena et al. 1996; de la Peña et al. 2000),
in which case loss of fitness is readily observed after
a couple of serial transfers. However, these works do
not address intermediate bottleneck sizes between 10
and 100, or the change in fitness over time. Therefore,
they do not allow a full assessment of the importance
of Muller's ratchet in virus evolution. An important
step towards more conclusive experimental results
has been presented in recent work by Chao and
coworkers (Chao et al. 1997; Burch and Chao 1999;
Burch and Chao 2000). There, a wide range of population
sizes has been investigated, and in particular in
Burch and Chao (1999), fitness measurements have
been performed after every bottleneck. Propagation
of lineages of a strain with impaired fitness showed
recovery back to the original fitness level for bottleneck
sizes of N = 33 or larger (Fig. 3 of Burch and Chao
1999). For N = 10, the original fitness could not
be regained within 100 generations. However, Burch
and Chao did not observe a further decline in fitness
at N = 10 either; instead, they observed a slight fitness
increase. Burch and Chao's results for large bottlenecks
demonstrate that compensatory mutations
are readily available for sufficiently large population
sizes, so that a virus population can easily recover from
a short sequence of extreme bottlenecks if later it is
allowed to expand again. Their results for N = 10
suggest two alternative interpretations. On the one
hand, compensatory and deleterious mutations may
cancel each other almost exactly, so that the net result
is a small fitness increase. On the other hand, the
impaired virus strain may already have reached the
point at which Muller's ratchet stops operating for
N = 10. In that case, the lineage is protected from
further mutation accumulation, and can safely exist
until some compensatory mutations occur eventually.
We believe that the second explanation is the more
accurate one, but the current data do not allow us to
reject either of the two scenarios conclusively. More
data at small bottleneck sizes (between one and 10)
and for longer times should allow a more accurate
assessment of these issues.

Appendix

The following calculation is directly analogous to 
the one presented by Woodcock and Higgs (1996). We
refer to their work for a more complete description of 
the necessary steps. 

Assume that an individual i carries k_i deleterious
mutations and creates an offspring with j = k_i +
m_i deleterious mutations. The probability that an
offspring carries exactly j mutations is given by the
transition matrix

$$
M_{jk} = \sum_{i=\max(0,\,k-j)}^{\min(k,\,L-j)}
  \binom{k}{i}\binom{L-k}{j-k+i}\,
  v^{i}(1-v)^{k-i}\, u^{j-k+i}(1-u)^{L-j-i}. \tag{10}
$$

The expectation values of m_i and m_i^2 are

$$
E(m_i) = \sum_j (j-k_i)\, M_{jk_i} = u(L-k_i) - v k_i, \tag{11}
$$
$$
E(m_i^2) = \sum_j (j-k_i)^2 M_{jk_i}
 = k_i^2 (u+v)^2 + k_i\left[u^2 - v^2 + (v-u) - 2uL(u+v)\right]
   + u(1-u)L + u^2L^2. \tag{12}
$$



Both results reduce to the ones found by Woodcock
and Higgs if we set v = u. Now consider two individuals
i and j with fitnesses w_i = (1 - s)^(k_i) and
w_j = (1 - s)^(k_j) that have n_i and n_j offspring in the
next generation. The expected values of n_i, n_i^2, and
n_i n_j have been calculated by Woodcock and Higgs,
and we only state the final result for completeness:

$$
E(n_i) = 1 - s(k_i - \langle k\rangle), \tag{13}
$$
$$
E(n_i^2) = 2 - 1/N - 3s(k_i - \langle k\rangle), \tag{14}
$$
$$
E(n_i n_j) = 1 - 1/N - s(k_i - \langle k\rangle) - s(k_j - \langle k\rangle). \tag{15}
$$

In what follows, we have to distinguish between averages
over a population, denoted by ⟨...⟩, and averages
over an ensemble of independent evolutionary
histories, denoted by an overbar, $\overline{\cdots}$. The
ensemble-averaged moments of the distribution of k_i are defined as

$$
M_n = \overline{\frac{1}{N}\sum_{i=1}^{N} \left(k_i - \langle k\rangle\right)^{n}}. \tag{16}
$$

For the second moment, we will in the following use
the symbol V instead of M_2.
Now, write

$$
\langle k\rangle_{t+1} = \frac{1}{N}\sum_i n_i (k_i + m_i). \tag{17}
$$



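With (11) and (13), and the equilibrium condition
$\overline{\langle k\rangle_{t+1}} = \overline{\langle k\rangle_{t}} = \langle k\rangle$, we find

$$
(u+v)\langle k\rangle + [1-(u+v)]\,sV = uL. \tag{18}
$$

Similarly, from

$$
\langle k^2\rangle_{t+1} = \frac{1}{N}\sum_i n_i (k_i + m_i)^2 \tag{19}
$$

and

$$
\langle k\rangle^2_{t+1} = \frac{1}{N^2}\sum_{i,j} n_i n_j (k_i + m_i)(k_j + m_j) \tag{20}
$$

we find in equilibrium, using (11)-(15),

$$
V = [1-(u+v)]^2\left[(1-1/N)V - sM_3\right]
  + [u(u-1) - v(v-1)]\left[(1-1/N)\langle k\rangle - sV\right]
  + u(1-u)L(1-1/N). \tag{21}
$$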

These two equations contain three variables, and in 
general every equation of higher order will contain 
correspondingly higher moments, so that the set of 
equations can never be solved. This problem can be 
avoided with a suitable closure approximation. Following
Woodcock and Higgs, we use

$$
M_3 = V\left(1 - 2\langle k\rangle/L\right), \tag{22}
$$

which is the expression for the third moment in an
infinite population. With (18), (21), and (22), ⟨k⟩ is
uniquely determined, and we find

$$
\frac{\langle k\rangle}{L} = \frac{\alpha}{4} - \frac{1}{2}\sqrt{\frac{\alpha^2}{4} - \beta} \tag{23}
$$

with

$$
\alpha = \frac{3u+v-2}{u+v-1} + \frac{1}{sN}
       + \frac{2(u+v)-(u+v)^2}{s(u+v-1)^2}
       + \frac{u-v}{N(u+v)}, \tag{24}
$$
$$
\beta = \frac{2u}{u+v}\left[\frac{2(u+v)-(u+v)^2}{s(u+v-1)^2}
      + \frac{1}{sN} + \frac{u}{u+v-1}
      + \frac{1}{N}\,\frac{u-1}{u+v-1}\right]. \tag{25}
$$
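The closed-form result is easy to evaluate numerically. The following is a minimal sketch, assuming the reading of α and β obtained by substituting (18) and (22) into (21) and solving the resulting quadratic in ⟨k⟩/L; the function name is hypothetical, and the code does not guard against parameter values for which the square root becomes imaginary.

```python
import math


def moment_expansion_fraction(u, v, s, N):
    """Moment-expansion estimate of <k>/L, following Eqs. (23)-(25).

    u, v: forward and back mutation rates per locus (u + v must not be 0 or 1)
    s:    selection strength (the expansion assumes small s)
    N:    population size
    """
    c = u + v
    # Terms shared by alpha and beta.
    core = (2.0 * c - c * c) / (s * (c - 1.0) ** 2) + 1.0 / (s * N)
    alpha = (3.0 * u + v - 2.0) / (c - 1.0) + core + (u - v) / (N * c)
    beta = (2.0 * u / c) * (core + u / (c - 1.0) + (u - 1.0) / (N * (c - 1.0)))
    return alpha / 4.0 - 0.5 * math.sqrt(alpha * alpha / 4.0 - beta)
```

For instance, `moment_expansion_fraction(0.01, 0.01, 0.01, 100)` corresponds to one point of the parameter set N = 100, s = u = 0.01 with v = 0.01. In the limit of very large N with v = 0 and small u, the expression approaches u/s, in line with the infinite-population result.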
Figure 1: Average fraction of mutated loci in an infinite
population as a function of u/s for various back
mutation rates v.

Figure 2: Average number of mutations ⟨k⟩ vs. the
back mutation rate v, for N = 100, and s = u = 0.01.
Solid lines represent the analytic prediction based on
(23), points represent simulation results, averaged over
5 replicates. The errors in the simulation are of the
order of the symbol size.

Figure 3: Number of mutations in the least loaded
class as a function of time, for N = 1000, L = 100,
s = 0.1, u = 0.01, and v = 0. The thin solid lines represent
simulation results, the thick dashed line represents
the theoretical prediction obtained from (9).

Figure 4: Average number of mutations in the population
vs. genomic mutation rate, for N = 500,
L = 100, s = 0.1, and v = 0. The individual points
represent simulation results, averaged
over 5 replicates. The errors are of the order of the
symbol size. The solid line represents the theoretical
prediction obtained from (9), and the dashed line
represents the moment expansion result (23). The
simulation results and equation (9) were evaluated
after t = 10^5 generations. The moment expansion
predicts an equilibrium point, so time does not enter
this equation.

Figure 5: Average number of mutations in the population
vs. genomic mutation rate, for various population
sizes N, and L = 100, s = 0.1, v = 0. The
individual points represent simulation results,
averaged over 5 replicates. The errors are
of the order of the symbol size.

Figure 6: Average number of mutation-free loci
L - ⟨k⟩ vs. mutation rate u, for various lengths L and
N = 50, s = 0.1, v = 0. The individual points represent
simulation results, averaged over 5
replicates. The errors are of the order of the symbol
size.